Abstract: A waveform channel is considered where the transmitted
signal is corrupted by Wiener phase noise and additive
white Gaussian noise (AWGN). A discrete-time channel model is 
introduced that is based on a multi-sample receiver. Tight lower 
bounds on the information rates achieved by the multi-sample 
receiver are computed by means of numerical simulations. The 
results show that oversampling at the receiver is beneficial for 
both strong and weak phase noise at high signal-to-noise ratios. 
The results are compared with results obtained when using other 
discrete-time models. 



I. Introduction

Communication systems often suffer from phase noise that arises, e.g., due to the instability of RF oscillators in satellite [1] or microwave links [2]. In optical fiber communication, phase noise arises due to the instability of laser oscillators [3] or due to cross-phase modulation (XPM) in Wavelength-Division-Multiplexing (WDM) systems [4].

The nature of the phase noise depends on the application. A commonly studied discrete-time model is

$$Y_k = X_{\mathrm{symb},k}\, e^{j\Theta_k} + Z_k \tag{1}$$

where $\{Y_k\}$ are the output symbols, $\{X_{\mathrm{symb},k}\}$ are the input symbols, $\{\Theta_k\}$ is the phase noise process and $\{Z_k\}$ is additive white Gaussian noise (AWGN). For example, Katz and Shamai [5] studied the model (1) when $\{\Theta_k\}$ is independent and identically distributed (i.i.d.) according to $p_\Theta(\cdot)$, when $\Theta_k$ is uniformly distributed (called a noncoherent AWGN channel) and when $\Theta_k$ has a Tikhonov (or von Mises) distribution (called a partially-coherent AWGN channel). Tikhonov phase noise models the residual phase error in systems with phase-tracking devices, e.g., phase-locked loops (PLL) and ideal interleavers/deinterleavers.

Tight lower bounds on the capacities of memoryless noncoherent and partially coherent AWGN channels were computed by solving an optimization problem numerically in [5] and [6], respectively. Dauwels and Loeliger [7] proposed a particle filtering method to compute information rates for discrete-time continuous-state channels with memory and applied the method to (1) for Wiener phase noise and autoregressive-moving-average (ARMA) phase noise. Barletta, Magarini and Spalvieri [8] computed lower bounds on information rates for (1) with Wiener phase noise by using the auxiliary channel technique proposed in [9] and they computed upper bounds in [10]. They also developed a lower bound based on Kalman filtering in [11]. Barbieri and Colavolpe [?] computed lower bounds with an auxiliary channel slightly different from [8].

In this paper, we study a waveform channel corrupted by Wiener phase noise and AWGN:

$$r(t) = x(t)\, e^{j\theta(t)} + n(t), \quad \text{for } t \in \mathbb{R} \tag{2}$$



where $x(t)$ and $r(t)$ are the transmitted and received signals, respectively, while $n(t)$ and $\theta(t)$ are the additive and phase noise, respectively. A detailed description of the model is given in Sec. II. This model is reasonable, for example, for optical fiber communication with low to intermediate power and laser phase noise, see [?]. As pointed out in [12], the discrete-time model (1) does not fit the channel (2) because filtering a phase-varying signal with a constant amplitude gives rise to an output with a varying amplitude. The effect of filtering persists for phase impairments other than Wiener phase noise, e.g., for XPM in optical fiber [13]. We developed in [12] a discrete-time channel model based on a multi-sample receiver, i.e., a filter whose output is sampled multiple times per symbol.

In this paper, we use techniques based on [9] to compute tight lower bounds on the information rates for the multi-sample receiver introduced in [12]. The paper is organized as follows. The continuous-time model is described in Sec. II and the discrete-time model of the multi-sample receiver is described in Sec. III. We develop a method to compute lower bounds on the information rates of a multi-sample receiver in Sec. IV. In Sec. V, we report the results of numerical simulations and Sec. VI concludes the paper.



II. Continuous-Time Model 

We use the following notation: $j = \sqrt{-1}$, $^*$ denotes the complex conjugate, $\delta_D$ is the Dirac delta function, $\lceil \cdot \rceil$ is the ceiling operator. We use $X^k$ to denote $(X_1, X_2, \ldots, X_k)$. Suppose the transmit-waveform is $x(t)$ and the receiver observes

$$r(t) = x(t)\, e^{j\theta(t)} + n(t) \tag{3}$$

where $n(t)$ is a realization of a white circularly-symmetric complex Gaussian process $N(t)$ with

$$E[N(t)] = 0, \qquad E[N(t_1) N^*(t_2)] = \sigma_N^2\, \delta_D(t_2 - t_1). \tag{4}$$

The phase $\theta(t)$ is a realization of a Wiener process $\Theta(t)$:

$$\Theta(t) = \Theta(0) + \int_0^t W(\tau)\, d\tau \tag{5}$$

where $\Theta(0)$ is uniform on $[-\pi, \pi)$ and $W(t)$ is a real Gaussian process with

$$E[W(t)] = 0, \qquad E[W(t_1) W(t_2)] = 2\pi\beta\, \delta_D(t_2 - t_1). \tag{6}$$

The processes $N(t)$ and $\Theta(t)$ are independent of each other and independent of the input. $N_0 = 2\sigma_N^2$ is the single-sided power spectral density of the additive noise. We define $U(t) = \exp(j\Theta(t))$. The autocorrelation function of $U(t)$ is

$$R_U(t_1, t_2) = E[U(t_1) U^*(t_2)] = \exp(-\pi\beta\, |t_2 - t_1|) \tag{7}$$

and the power spectral density of $U(t)$ is

$$S_U(f) = \int_{-\infty}^{\infty} R_U(t, t+\tau)\, e^{-j 2\pi f \tau}\, d\tau = \frac{1}{\pi}\, \frac{\beta/2}{(\beta/2)^2 + f^2}. \tag{8}$$

The spectrum is said to have a Lorentzian shape. It is easy to show that $\beta = f_{\mathrm{FWHM}} = 2 f_{\mathrm{HWHM}}$ where $f_{\mathrm{FWHM}}$ is the full-width at half-maximum and $f_{\mathrm{HWHM}}$ is the half-width at half-maximum. Let $T$ be the transmission interval, then the transmitted waveforms must satisfy the power constraint

$$E\left[\frac{1}{T} \int_0^T |X(t)|^2\, dt\right] \le \mathcal{P} \tag{9}$$

where $X(t)$ is a random process whose realization is $x(t)$.

III. Discrete-Time Model

Let $(x_{\mathrm{symb},1}, x_{\mathrm{symb},2}, \ldots, x_{\mathrm{symb},n_{\mathrm{symb}}})$ be the codeword sent by the transmitter. Suppose the transmitter uses a unit-energy pulse $g(t)$ whose time support is $[0, T_{\mathrm{symb}}]$ where $T_{\mathrm{symb}}$ is the symbol interval. The waveform sent by the transmitter is

$$x(t) = \sum_{m=1}^{n_{\mathrm{symb}}} x_{\mathrm{symb},m}\, g(t - (m-1) T_{\mathrm{symb}}). \tag{10}$$

Let $L$ be the number of samples per symbol ($L \ge 1$) and define the sample interval as

$$\Delta = \frac{T_{\mathrm{symb}}}{L}. \tag{11}$$

The received waveform $r(t)$ is filtered using an integrator over a sample interval to give the output signal

$$y(t) = \int_{t-\Delta}^{t} r(\tau)\, d\tau. \tag{12}$$

The signal $y(t)$ is a realization of $Y(t)$ that is sampled at $t = k\Delta$, $k = 1, \ldots, n = n_{\mathrm{symb}} L$, to yield the discrete-time model:

$$Y_k = X_{\mathrm{symb},\lceil k/L \rceil}\, \Delta\, e^{j\Theta_k}\, F_k + N_k \tag{13}$$

where $F_k = F(k\Delta)$, $\Theta_k = \Theta((k-1)\Delta)$,

$$F_k = \frac{1}{\Delta} \int_{(k-1)\Delta}^{k\Delta} g(\tau \bmod T_{\mathrm{symb}})\, e^{j(\Theta(\tau) - \Theta_k)}\, d\tau \tag{14}$$

and

$$N_k = \int_{(k-1)\Delta}^{k\Delta} N(\tau)\, d\tau. \tag{15}$$

The process $\{N_k\}$ is an i.i.d. circularly-symmetric complex Gaussian process with mean $0$ and $E[|N_k|^2] = \sigma_N^2 \Delta$ while the process $\{\Theta_k\}$ is the discrete-time Wiener process:

$$\Theta_k = \Theta_{k-1} + W_k \mod 2\pi \tag{16}$$

for $k = 2, \ldots, n$, where $\Theta_1$ is uniform on $[-\pi, \pi)$ and $\{W_k\}$ is an i.i.d. real Gaussian process with mean $0$ and $E[|W_k|^2] = 2\pi\beta\Delta$, i.e., the probability density function (pdf) of $W_k$ is $p_{W_k}(w) = G(w; 0, \sigma_W^2)$ where

$$G(w; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(w-\mu)^2}{2\sigma^2}\right) \tag{17}$$

and $\sigma_W^2 = 2\pi\beta\Delta$. The random variable $(W_k \bmod 2\pi)$ is a wrapped Gaussian random variable and its pdf is $p_W(w; \sigma_W^2)$ where

$$p_W(w; \sigma^2) = \sum_{i=-\infty}^{\infty} G(w - 2i\pi; 0, \sigma^2). \tag{18}$$
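The definitions (16)-(18) can be illustrated with a minimal Python sketch. The function names, the truncation of the infinite sum in (18) to a finite number of terms, and the choice of $[-\pi, \pi)$ as the wrapping range are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def gaussian_pdf(w, mu, var):
    """G(w; mu, var) as in (17)."""
    return math.exp(-(w - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def wrapped_gaussian_pdf(w, var, terms=20):
    """Truncated version of (18): the index i runs over -terms..terms (assumption)."""
    return sum(gaussian_pdf(w - 2.0 * math.pi * i, 0.0, var)
               for i in range(-terms, terms + 1))

def wrap(theta):
    """Map an angle to the interval [-pi, pi) (up to floating-point rounding)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi

def wiener_phase(n, beta, delta, rng):
    """Draw Theta_1..Theta_n following (16) with var(W_k) = 2*pi*beta*delta."""
    sigma = math.sqrt(2.0 * math.pi * beta * delta)
    theta = [rng.uniform(-math.pi, math.pi)]      # Theta_1 is uniform
    for _ in range(n - 1):
        theta.append(wrap(theta[-1] + rng.gauss(0.0, sigma)))
    return theta
```

For example, `wiener_phase(100, beta=0.125, delta=0.25, rng=random.Random(1))` draws 100 phase samples for a hypothetical linewidth and sample interval.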

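Moreover, $\{F_k\}$ and $\{\Theta_k\}$ are independent of $\{N_k\}$ but not independent of each other. Finally, equations (9) and (10) imply the power constraint

$$\frac{1}{n_{\mathrm{symb}}} \sum_{m=1}^{n_{\mathrm{symb}}} E[|X_{\mathrm{symb},m}|^2] \le P = \mathcal{P}\, T_{\mathrm{symb}}. \tag{19}$$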



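It is convenient to define $X_k$ as

$$X_k = X(k\Delta) = X_{\mathrm{symb},\lceil k/L \rceil}\, g((k \bmod L)\Delta). \tag{20}$$

It follows that $I(X_{\mathrm{symb}}^{n_{\mathrm{symb}}}; Y^n) = I(X^n; Y^n)$. We define the information rate

$$I(X;Y) = \lim_{n_{\mathrm{symb}} \to \infty} \frac{1}{n_{\mathrm{symb}}}\, I(X^n; Y^n). \tag{21}$$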

One difficulty in evaluating (21) is that the joint distribution of $\{F_k\}$ and $\{\Theta_k\}$ is not available in closed form. Even the distribution of $F_k$ is not available in closed form (there is an approximation for small linewidth, see (16) in [3]). However, we can numerically compute tight lower bounds on $I(X;Y)$ by using the auxiliary-channel technique described next.



IV. Lower Bound 

The Auxiliary-Channel Lower Bound Theorem in [9, Sec. VI] states that for two random variables $X$ and $Y$, we have

$$I(X;Y) \ge \underline{I}(X;Y) = E\left[\log \frac{q_{Y|X}(Y|X)}{q_Y(Y)}\right] \tag{22}$$

where $q_{Y|X}(\cdot|\cdot)$ is an arbitrary auxiliary channel and

$$q_Y(y) = \sum_{x} p_X(x)\, q_{Y|X}(y|x) \tag{23}$$

where $p_X$ is the true distribution of $X$. The distribution $q_Y(\cdot)$ is thus the output distribution obtained by connecting the true input source to the auxiliary channel.

Fig. 1. Bayesian network for $X^n$, $S^n$, $\Psi^n$ for $n = 9$.

Using this theorem, we can compute a lower bound on $I(X;Y)$ by using the following algorithm [9]:

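1) Sample a long sequence $(x^n, y^n)$ according to the true joint distribution of $X^n$ and $Y^n$.

2) Compute $q_{Y^n|X^n}(y^n|x^n)$ and

$$q_{Y^n}(y^n) = \sum_{\tilde{x}^n} p_{X^n}(\tilde{x}^n)\, q_{Y^n|X^n}(y^n|\tilde{x}^n) \tag{24}$$

where $p_{X^n}$ is the true distribution of $X^n$.

3) Estimate $\underline{I}(X;Y)$ using

$$\underline{I}(X;Y) \approx \frac{1}{n_{\mathrm{symb}}} \log\left(\frac{q_{Y^n|X^n}(y^n|x^n)}{q_{Y^n}(y^n)}\right). \tag{25}$$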
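The three steps can be sketched in Python on a toy problem. The setup below is an assumption chosen for brevity and is not the channel studied in this paper: the true channel is a memoryless real AWGN channel with equiprobable inputs from {-1, +1}, and the auxiliary channel is the same model with a possibly mismatched noise variance. Because both channels are memoryless, the sequence probabilities in (24) and (25) factor into per-sample terms.

```python
import math
import random

def gauss(y, mean, var):
    """Real Gaussian density with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def lower_bound_estimate(n, true_var, aux_var, seed=0):
    """Estimate the auxiliary-channel lower bound in bits per symbol."""
    rng = random.Random(seed)
    alphabet = (-1.0, 1.0)                     # hypothetical input alphabet
    total = 0.0
    for _ in range(n):
        # Step 1: sample (x, y) from the TRUE joint distribution.
        x = rng.choice(alphabet)
        y = x + rng.gauss(0.0, math.sqrt(true_var))
        # Step 2: evaluate the AUXILIARY channel and its output distribution.
        q_cond = gauss(y, x, aux_var)
        q_marg = sum(0.5 * gauss(y, a, aux_var) for a in alphabet)
        # Step 3: accumulate the log-ratio.
        total += math.log2(q_cond / q_marg)
    return total / n
```

With the same seed, a matched auxiliary channel (`aux_var == true_var`) estimates the mutual information itself, while a mismatched one gives a smaller value, which is the sense in which the estimate is a lower bound.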

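Auxiliary Channel I: Consider the auxiliary channel

$$\Psi_k = X_k\, \Delta\, e^{j\Theta_k} + N_k \tag{26}$$

where $\{\Theta_k\}$ and $\{N_k\}$ are defined in Sec. III and $X_k$ is defined by (20). The channel $\Psi$ is the same as $Y$ in (13) except that $F_k$ is replaced with $g((k \bmod L)\Delta)$. The channel is described by the conditional distribution

$$p_{\Psi^n|X^n}(y^n|x^n) = \int_{\theta^n} p_{\Theta^n,\Psi^n|X^n}(\theta^n, y^n|x^n)\, d\theta^n \tag{27}$$

where

$$p_{\Theta^n,\Psi^n|X^n}(\theta^n, y^n|x^n) = \prod_{k=1}^{n} p_{\Theta_k|\Theta_{k-1}}(\theta_k|\theta_{k-1})\, p_{\Psi|X,\Theta}(y_k|x_k,\theta_k) \tag{28}$$

with

$$p_{\Theta_k|\Theta_{k-1}}(\theta|\tilde{\theta}) = \begin{cases} p_W(\theta - \tilde{\theta}; \sigma_W^2), & k \ge 2 \\ 1/(2\pi), & k = 1 \end{cases} \tag{29}$$

and

$$p_{\Psi|X,\Theta}(y|x,\theta) = \frac{1}{\pi \sigma_N^2 \Delta} \exp\left(-\frac{|y - x\, \Delta\, e^{j\theta}|^2}{\sigma_N^2 \Delta}\right). \tag{30}$$

The channel $p_{\Psi^n|X^n}$ has continuous states $\theta^n$, which makes step 2 of the algorithm computationally infeasible.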



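Auxiliary Channel II: We use the following auxiliary channel for the numerical simulations:

$$\Psi_k = X_k\, \Delta\, e^{jS_k} + N_k \tag{31}$$

which has the conditional probability

$$p_{\Psi^n|X^n}(y^n|x^n) = \sum_{s^n \in \mathcal{S}^n} p_{S^n,\Psi^n|X^n}(s^n, y^n|x^n) \tag{32}$$

where $\mathcal{S}$ is a finite set and

$$p_{S^n,\Psi^n|X^n}(s^n, y^n|x^n) = \prod_{k=1}^{n} p_{S_k|S_{k-1}}(s_k|s_{k-1})\, p_{\Psi|X,\Theta}(y_k|x_k,s_k) \tag{33}$$

where

$$p_{S_k|S_{k-1}}(s|\tilde{s}) = \begin{cases} Q(s|\tilde{s}), & k \ge 2 \\ 1/|\mathcal{S}|, & k = 1. \end{cases} \tag{34}$$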



Next, we describe our choice of $\mathcal{S}$ and $Q(\cdot|\cdot)$. We partition $[-\pi, \pi)$ into $S$ intervals with equal lengths and pick the mid points of these intervals to be the elements of $\mathcal{S}$, i.e., we have

$$\mathcal{S} = \{s_i : i = 1, \ldots, S\} \quad \text{where} \quad s_i = \left(i - \tfrac{1}{2}\right) \frac{2\pi}{S} - \pi. \tag{35}$$

The state transition probability $Q(\cdot|\cdot)$ is chosen similar to [8] and [?]:



Q(s\S) 



2n 
~S 



Pw( 



(36) 



where $\mathcal{R}(s) = [s - \pi/S, s + \pi/S)$, i.e., $\mathcal{R}(s)$ is the interval whose midpoint is $s$. The larger $S$ and $L$ are, the better the auxiliary channel (31) approximates the actual channel (13). We remark that even for small $S$ and $L$, the auxiliary channel gives a valid lower bound on $I(X;Y)$.
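A quantized state set and a transition matrix of this kind can be sketched in Python as follows. The state set follows the midpoint construction described above. The transition rule used here is an assumption made for illustration and is not claimed to be the paper's exact definition of $Q$: the wrapped Gaussian density centered at the previous midpoint is integrated over the interval $\mathcal{R}(s)$ of the next state. The function names and the truncation of the wrapping sum are likewise hypothetical.

```python
import math

def phase_states(S):
    """Midpoints of S equal-length intervals that partition [-pi, pi)."""
    return [(2 * i - 1) * math.pi / S - math.pi for i in range(1, S + 1)]

def gaussian_cdf(x, var):
    """CDF of a zero-mean Gaussian with variance var."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * var)))

def transition_matrix(S, var_w, terms=10):
    """Return (states, Q) with Q[i][j] = P(next state i | previous state j).

    Assumed rule: integrate the wrapped Gaussian with variance var_w,
    centered at the previous midpoint, over the interval around state i.
    """
    states = phase_states(S)
    half = math.pi / S
    Q = [[0.0] * S for _ in range(S)]
    for j, prev in enumerate(states):
        for i, cur in enumerate(states):
            lo, hi = cur - half - prev, cur + half - prev
            Q[i][j] = sum(
                gaussian_cdf(hi - 2.0 * math.pi * m, var_w)
                - gaussian_cdf(lo - 2.0 * math.pi * m, var_w)
                for m in range(-terms, terms + 1))
    return states, Q
```

Under this rule each column of `Q` sums to one and `Q[i][j]` depends only on `(i - j) mod S`, so the matrix is circulant.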

A. Computing The Conditional Probability 

Suppose the input $X^n$ has the distribution $p_{X^n}$. A Bayesian network for $X^n$, $S^n$, $\Psi^n$ is shown in Fig. 1. The probability $p_{\Psi^n|X^n}(y^n|x^n)$ can be computed using

$$p_{\Psi^n|X^n}(y^n|x^n) = \sum_{s \in \mathcal{S}} \rho_n(s) \tag{37}$$

where we recursively compute

$$\rho_k(s) = p_{S_k,\Psi^k|X^n}(s, y^k|x^n) \tag{38}$$
$$\overset{(a)}{=} \sum_{\tilde{s} \in \mathcal{S}} p_{S_k,S_{k-1},\Psi^k|X^n}(s, \tilde{s}, y^k|x^n)$$
$$\overset{(b)}{=} \sum_{\tilde{s} \in \mathcal{S}} \rho_{k-1}(\tilde{s})\, p_{S_k,\Psi_k|S_{k-1},\Psi^{k-1},X^n}(s, y_k|\tilde{s}, y^{k-1}, x^n)$$
$$= \sum_{\tilde{s} \in \mathcal{S}} \rho_{k-1}(\tilde{s})\, Q(s|\tilde{s})\, p_{\Psi|X,\Theta}(y_k|x_k, s) \tag{39}$$

with the initial value $\rho_0(s) = 1/|\mathcal{S}|$. Step (a) is a marginalization, (b) follows from Bayes' rule and the definition of $\rho_k$ in (38), while (39) follows from the structure of Fig. 1. We remark that (39) is the same as with independent $X_1, \ldots, X_n$, e.g., see equation (9) in [14, Sec. IV].

Fig. 2. Bayesian network for $X^n$, $S^n$, $\Psi^n$ for $n = 9$ and $L = 3$.
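The forward recursion over the quantized phase states can be sketched in Python as below. Several details are assumptions made for illustration: the inputs `xs` are taken to be the effective amplitudes (with the factor $\Delta$ already absorbed), `noise_var` stands for the per-sample noise variance, the first state is treated as uniform, and the state metric is rescaled at every step so that long sequences do not underflow. The rescaling is a standard numerical device, not something stated in the text.

```python
import cmath
import math

def likelihood(y, x, theta, noise_var):
    """Circularly-symmetric complex Gaussian density of y around x*exp(j*theta)."""
    d = abs(y - x * cmath.exp(1j * theta))
    return math.exp(-d * d / noise_var) / (math.pi * noise_var)

def log2_conditional(ys, xs, states, Q, noise_var):
    """Return log2 of the auxiliary-channel probability of ys given xs.

    Q[i][j] is the probability of moving from state j to state i.
    """
    S = len(states)
    rho = [1.0 / S] * S                      # initial value rho_0
    log2_total = 0.0
    for k, (y, x) in enumerate(zip(ys, xs)):
        if k == 0:
            pred = rho                       # the first state is uniform
        else:
            pred = [sum(Q[i][j] * rho[j] for j in range(S)) for i in range(S)]
        rho = [likelihood(y, x, states[i], noise_var) * pred[i] for i in range(S)]
        scale = sum(rho)                     # rescale to avoid underflow
        log2_total += math.log2(scale)
        rho = [r / scale for r in rho]
    return log2_total
```

The product of the per-step scale factors equals the sum of the unnormalized state metric after the last sample, so the returned value is the logarithm of the summed final metric.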

B. Computing The Marginal Probability 

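Define $\mathbf{Y}_m = (Y_{(m-1)L+1}, Y_{(m-1)L+2}, \ldots, Y_{(m-1)L+L})$ and $\mathbf{X}_m = (X_{(m-1)L+1}, X_{(m-1)L+2}, \ldots, X_{(m-1)L+L})$. Suppose the input symbols are i.i.d. and $X_{\mathrm{symb},m} \in \mathcal{X}$ where $\mathcal{X}$ is a finite set. Therefore, $p_{X^n}$ has the form

$$p_{X^n}(x^n) = \prod_{m=1}^{n_{\mathrm{symb}}} p_{\mathbf{X}}(\mathbf{x}_m). \tag{40}$$

A Bayesian network for $X^n$, $S^n$, $\Psi^n$ is shown in Fig. 2. The probability $p_{\Psi^n}(y^n)$ can be computed using

$$p_{\Psi^n}(y^n) = \sum_{s \in \mathcal{S}} \psi_{n_{\mathrm{symb}}}(s) \tag{41}$$

where $\psi_m(s) = p_{S_{mL},\Psi^{mL}}(s, y^{mL})$ which can be computed using the recursion:

$$\psi_m(s) = \sum_{\mathbf{x} \in \mathcal{X}_L} \sum_{\tilde{s} \in \mathcal{S}} p_{\mathbf{X}}(\mathbf{x})\, \psi_{m-1}(\tilde{s})\, p_{S_{mL},\mathbf{\Psi}_m|S_{(m-1)L},\mathbf{X}_m}(s, \mathbf{y}_m|\tilde{s}, \mathbf{x}) \tag{42}$$

with the initial value $\psi_0(s) = 1/|\mathcal{S}|$. The set $\mathcal{X}_L$ is

$$\mathcal{X}_L = \{x \cdot (g(\Delta), g(2\Delta), \ldots, g(L\Delta)) : x \in \mathcal{X}\}. \tag{43}$$

We remark that $|\mathcal{X}_L| = |\mathcal{X}|$ and not $|\mathcal{X}|^L$. Next, we define

$$\chi_{m,L}(s, \tilde{s}, \mathbf{x}) = p_{S_{mL},\mathbf{\Psi}_m|S_{(m-1)L},\mathbf{X}_m}(s, \mathbf{y}_m|\tilde{s}, \mathbf{x}) \tag{44}$$

for $s, \tilde{s} \in \mathcal{S}$ and $\mathbf{x} \in \mathcal{X}_L$. Computing $\chi_{m,L}(s, \tilde{s}, \mathbf{x})$ is similar to computing $\rho_n$ (see (39)). Intuitively, this is because a block of $L$ samples in Fig. 2 has a structure similar to Fig. 1. More precisely, $\chi_{m,L}(s, \tilde{s}, \mathbf{x})$ can be computed recursively by using

$$\chi_{m,\ell}(s, \tilde{s}, \mathbf{x}) = \sum_{s' \in \mathcal{S}} \chi_{m,\ell-1}(s', \tilde{s}, \mathbf{x})\, Q(s|s')\, p_{\Psi|X,\Theta}(y_{(m-1)L+\ell}|x_\ell, s) \tag{45}$$

with the initial value

$$\chi_{m,0}(s, \tilde{s}, \mathbf{x}) = \begin{cases} 1, & s = \tilde{s} \\ 0, & \text{otherwise.} \end{cases} \tag{46}$$

Therefore, computing $p_{\Psi^n}(y^n)$ involves two levels of recursion: 1) recursion over the symbols as described by (42) and 2) recursion over the samples within a symbol as described by (45).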

V. Numerical Simulations 

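We use two pulses with a symbol-interval time support:
• A unit-energy square pulse

$$g_1(t) = \frac{1}{\sqrt{T_{\mathrm{symb}}}}\, \mathrm{rect}\left(\frac{t}{T_{\mathrm{symb}}}\right) \tag{47}$$

where

$$\mathrm{rect}(t) = \begin{cases} 1, & |t| \le 1/2, \\ 0, & \text{otherwise.} \end{cases} \tag{48}$$

• A unit-energy cosine-squared pulse

$$g_2(t) = \frac{1}{\sqrt{T_{\mathrm{symb}}/2}} \cos\left(\frac{\pi t}{T_{\mathrm{symb}}}\right) \mathrm{rect}\left(\frac{t}{T_{\mathrm{symb}}}\right). \tag{49}$$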



The first step of the algorithm is to sample a long sequence according to the true joint distribution of $X^n$ and $Y^n$. To generate samples according to the original channel (13), we must accurately represent digitally the continuous-time waveform (3). We use a simulation oversampling rate $L_{\mathrm{sim}} = 1024$ samples/symbol. After the filter (12), the receiver has $L$ samples/symbol distributed according to (13). Next, to choose a proper sequence length, we follow the approach suggested in [9]: for a candidate length, run the algorithm about 10 times (each with a new random seed) and check whether all estimates of the information rate agree up to the desired accuracy. We used $n_{\mathrm{symb}} = 10^4$ unless otherwise stated. We define the signal-to-noise ratio as $\mathrm{SNR} = P/(\sigma_N^2 T_{\mathrm{symb}}) = \mathcal{P}/\sigma_N^2$.
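The sampling step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the simulator used for the reported results: it handles only the square pulse, approximates the continuous-time integral by a left Riemann sum on a grid of `L_sim` points per symbol, draws the integrated additive noise directly at the sample level with variance `noise_var * delta`, and normalizes the symbol interval to `T_symb = 1.0` by default. All names are hypothetical.

```python
import cmath
import math
import random

def simulate_multisample(symbols, L, L_sim, beta, noise_var, rng, T_symb=1.0):
    """Return the L samples per symbol of an integrate-and-dump receiver."""
    assert L_sim % L == 0, "L_sim must be a multiple of L"
    dt = T_symb / L_sim                        # fine simulation step
    delta = T_symb / L                         # receiver sample interval
    per_sample = L_sim // L
    g = 1.0 / math.sqrt(T_symb)                # unit-energy square pulse
    step_std = math.sqrt(2.0 * math.pi * beta * dt)
    noise_std = math.sqrt(noise_var * delta / 2.0)   # per real dimension
    theta = rng.uniform(-math.pi, math.pi)     # uniform initial phase
    out = []
    for x in symbols:
        for _ in range(L):
            acc = 0j
            for _ in range(per_sample):
                acc += x * g * cmath.exp(1j * theta) * dt   # Riemann sum
                theta += rng.gauss(0.0, step_std)           # Wiener increment
            noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            out.append(acc + noise)
    return out
```

A call such as `simulate_multisample(symbols, L=16, L_sim=1024, beta=0.25, noise_var=0.1, rng=random.Random(0))` would produce 16 samples per symbol; the numerical values of `beta` and `noise_var` here are placeholders.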



For efficient implementation of (39), $p_{\Psi|X,\Theta}(\cdot|\cdot,\cdot)$ can be factored out of the summation to yield:

$$\rho_k(s) = p_{\Psi|X,\Theta}(y_k|x_k, s)\, \underbrace{\sum_{\tilde{s} \in \mathcal{S}} \rho_{k-1}(\tilde{s})\, Q(s|\tilde{s})}_{\rho'_k(s)}. \tag{50}$$

Moreover, since $Q(\cdot|\cdot)$ can be represented by a circulant matrix due to symmetry, $\rho'_k(\cdot)$ can be computed efficiently using the Fast Fourier Transform (FFT). Similarly, the computation of (45) can be done efficiently by factoring out $p_{\Psi|X,\Theta}(\cdot|\cdot,\cdot)$ and by using the FFT.
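The FFT shortcut rests on the fact that multiplying by a circulant matrix is a circular convolution. A minimal Python sketch is given below. It assumes the number of states is a power of two (as with the values 16, 32, 64 and 128 used in this paper) and uses a small textbook radix-2 FFT so that the example needs no external library; the function names are hypothetical.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def circulant_apply(c, rho):
    """Compute sum_j c[(i - j) mod n] * rho[j] for every i via the FFT.

    c is the first column of the circulant matrix, i.e. Q[i][j] = c[(i - j) % n].
    """
    n = len(c)
    spectrum = [a * b for a, b in zip(fft(c), fft(rho))]
    return [v.real / n for v in fft(spectrum, invert=True)]
```

The direct sum costs on the order of $S^2$ operations per step, while the FFT route costs on the order of $S \log S$.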




Fig. 3. Lower bounds on rates for 16-QAM, square transmit-pulse and multi-sample receiver at $f_{\mathrm{HWHM}} T_{\mathrm{symb}} = 0.125$.

Fig. 4. Lower bounds on rates for 16-QAM, cosine-squared transmit-pulse and multi-sample receiver at $f_{\mathrm{HWHM}} T_{\mathrm{symb}} = 0.125$.



A. Excessively Large Linewidth 

Suppose $f_{\mathrm{HWHM}} T_{\mathrm{symb}} = 0.125$ and the input symbols are independently and uniformly distributed (i.u.d.) 16-QAM. Fig. 3 shows an estimate of $\underline{I}(X;Y)$ for a square transmit-pulse, i.e., $g(t) = g_1(t - T_{\mathrm{symb}}/2)$ and an $L$-sample receiver with $L = 4, 8, 16$ and $S = 16, 32, 64$. The curves with $S = 64$ are indistinguishable from the curves with $S = 32$ over the entire SNR range for all values of $L$, and hence $S = 32$ is adequate up to 25 dB. Even $S = 16$ is adequate up to 20 dB. The important trend in Fig. 3 is that a higher oversampling rate $L$ is needed at high SNR to extract all the information from the received signal. For example, $L = 4$ suffices up to SNR ~ 10 dB, $L = 8$ suffices up to SNR ~ 15 dB but $L \ge 16$ is needed beyond that. It was pointed out in [9] that the lower bounds can be interpreted as the information rates achieved by mismatched decoding. For example, $\underline{I}(X;Y)$ for $L = 8$ and $S \ge 32$ in Fig. 3 is essentially the information rate achieved by a multi-sample (8-sample) receiver that uses maximum-likelihood decoding for the simplified channel (26) when it is operated in the original channel (13).

Fig. 4 shows an estimate of $\underline{I}(X;Y)$ for a cosine-squared transmit-pulse, i.e., $g(t) = g_2(t - T_{\mathrm{symb}}/2)$ and an $L$-sample receiver at $L = 4, 8, 16$ and $S = 16, 32, 64$. We find that $S = 32$ suffices up to ~ 25 dB. We see in Fig. 4 the same trend as in Fig. 3: higher $L$ is needed at higher SNR. Comparing Fig. 3 with Fig. 4 indicates that the square pulse is better than the cosine-squared pulse for the same oversampling rate $L$.

B. Large Linewidth 

As the linewidth decreases, the benefit of oversampling at the receiver becomes apparent only at higher SNR. For example, for $f_{\mathrm{HWHM}} T_{\mathrm{symb}} = 0.0125$ and i.u.d. 16-PSK input, Fig. 5 shows an estimate of $\underline{I}(X;Y)$ for a square transmit-pulse and an $L$-sample receiver at $L = 1, 2, 4, 8, 16$ and $S = 64$. We see that $L = 4$ suffices up to SNR ~ 19 dB, $L = 8$ suffices up to SNR ~ 24 dB and only beyond that $L \ge 16$ is necessary.

Fig. 5. Lower bounds on rates for 16-PSK, square transmit-pulse and multi-sample receiver at $f_{\mathrm{HWHM}} T_{\mathrm{symb}} = 0.0125$.

We conclude from Fig. 3-5 that the required $L$ depends on 1) the linewidth $f_{\mathrm{FWHM}}$ of the phase noise; 2) the pulse $g(t)$; and 3) the SNR.

C. Comparison With Other Models 

We compare the discrete-time model of the multi-sample receiver with other discrete-time models. The simulation parameters for our model (13) are $n_{\mathrm{symb}} = 10^4$, $L = 16$ (with $L_{\mathrm{sim}} = 1024$) and $S = 64$ for 16-QAM ($S = 128$ was too computationally intensive) and $S = 128$ for QPSK.




Fig. 6. Comparison of information rates for different models (horizontal axis: SNR in dB).

In Fig. 6 we show curves for the Baud-rate model used in [?] and [?]-[?]. The model is (1) where the phase noise is a Wiener process whose noise increments have variance $\gamma^2$. We set $\gamma^2 = 2\pi\beta T_{\mathrm{symb}}$. The simulation parameters for the Baud-rate model are $n_{\mathrm{symb}} = 10^5$ and $S = 128$.

We also show curves for the Martalo-Tripodi-Raheli (MTR) model [14] in Fig. 6. For the sake of comparison, we adapt the model in [14] from a square-root raised-cosine pulse to a square pulse and write the "matched" filter output $\{V_m\}$ as

$$V_m = \sum_{\ell=1}^{L} \Psi_{(m-1)L+\ell} \tag{51}$$

where $m = 1, \ldots, n_{\mathrm{symb}}$ and $\Psi_k$ is defined in (26). The auxiliary channel is

$$Y_m = X_{\mathrm{symb},m}\, e^{j\Theta_m} + Z_m, \quad m \ge 1 \tag{52}$$

where the process $\{Z_m\}$ is an i.i.d. circularly-symmetric complex Gaussian process with mean $0$ and $E[|Z_m|^2] = \sigma_N^2 T_{\mathrm{symb}}$ while the process $\{\Theta_m\}$ is a first-order Markov process (not a Wiener process) with a time-invariant transition probability, i.e., for $k \ge 2$ and all $\theta_k, \theta_{k-1} \in [-\pi, \pi)$, we have $p_{\Theta_k|\Theta_{k-1}}(\theta_k|\theta_{k-1}) = p_{\Theta_2|\Theta_1}(\theta_k|\theta_{k-1})$. Furthermore, the phase space is quantized to a finite number $S$ of states and the transition probabilities are estimated by means of simulation. The probabilities are then used to compute a lower bound on the information rate. The simulation parameters for the MTR model are $n_{\mathrm{symb}} = 10^5$, $L = 16$ and $S = 128$.

We see that the Baud-rate and MTR models saturate at 
a rate well below the rate achieved by the multi-sample 
receiver. Moreover, the multi-sample receiver achieves the 
full 4 bits/symbol and 2 bits/symbol of 16-QAM and QPSK, 
respectively, at high SNR. 

VI. Conclusion 

We studied a waveform channel impaired by Wiener phase noise and AWGN by evaluating via numerical simulations tight lower bounds on the information rates achieved by a multi-sample receiver. We found that the required oversampling rate depends on the linewidth of the phase noise, the shape of the transmit-pulse and the signal-to-noise ratio. The results demonstrate that multi-sample receivers increase the information rate for both strong and weak phase noise at high SNR. We compared our results with the results obtained by using other discrete-time models.